Overview

Dataset statistics

Number of variables: 22
Number of observations: 1000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 688.2 KiB
Average record size in memory: 704.7 B

Variable types

NUM: 12
CAT: 8
BOOL: 2

Reproduction

Analysis started: 2020-06-20 07:49:14.887785
Analysis finished: 2020-06-20 07:49:43.334721
Duration: 28.45 seconds
Version: pandas-profiling v2.7.1
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

NumCompaniesWorked has 130 (13.0%) zeros
TrainingTimesLastYear has 34 (3.4%) zeros
YearsAtCompany has 33 (3.3%) zeros
YearsInCurrentRole has 169 (16.9%) zeros
YearsSinceLastPromotion has 396 (39.6%) zeros
YearsWithCurrManager has 182 (18.2%) zeros
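
Each zero warning above pairs a count with its share of the 1000 observations. A minimal sketch of that arithmetic follows. The helper `zero_share` and the synthetic column are illustrative assumptions, not part of pandas-profiling.

```python
def zero_share(values):
    """Return (count, percent) of entries equal to zero."""
    zeros = sum(1 for v in values if v == 0)
    return zeros, 100.0 * zeros / len(values)


# Hypothetical column: 396 zeros among 1000 values, mirroring the
# YearsSinceLastPromotion warning (the non-zero values are made up).
column = [0] * 396 + [1] * 604
count, pct = zero_share(column)
print(f"{count} ({pct:.1f}%) zeros")
```

Whether a high share of zeros is a problem depends on the variable: for a tenure-style count such as YearsSinceLastPromotion, zero is a legitimate value rather than a placeholder for missing data.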

Variables

Age
Real number (ℝ≥0)

Distinct count: 43
Unique (%): 4.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.883
Minimum: 18
Maximum: 60
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 18
5-th percentile: 24
Q1: 30
Median: 36
Q3: 43
95-th percentile: 54
Maximum: 60
Range: 42
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.125052196
Coefficient of variation (CV): 0.2474053682
Kurtosis: -0.4503897456
Mean: 36.883
Median Absolute Deviation (MAD): 6
Skewness: 0.3763870168
Sum: 36883
Variance: 83.26657758
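
Several of the figures above are derived from others in the same block. The sketch below recomputes them from the reported Age values using the usual textbook definitions (mean = sum / n, CV = standard deviation / mean, variance = standard deviation squared, IQR = Q3 - Q1, range = maximum - minimum). The variable names are illustrative, and it is an assumption that the report uses exactly these definitions.

```python
# Values copied from the Age section of this report.
n = 1000
total = 36883
std = 9.125052196
q1, q3 = 30, 43
minimum, maximum = 18, 60

mean = total / n                 # arithmetic mean
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance as the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"mean={mean:.3f} cv={cv:.10f} variance={variance:.8f}")
print(f"iqr={iqr} range={value_range}")
```

The recomputed values can be compared with the reported Mean, CV, Variance, IQR and Range as a quick consistency check on the block.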
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
35 56 5.6%
 
34 49 4.9%
 
29 45 4.5%
 
31 44 4.4%
 
36 43 4.3%
 
38 43 4.3%
 
30 42 4.2%
 
40 40 4.0%
 
32 37 3.7%
 
33 36 3.6%
 
Other values (33) 565 56.5%
 
ValueCountFrequency (%) 
18 5 0.5%
 
19 7 0.7%
 
20 7 0.7%
 
21 10 1.0%
 
22 9 0.9%
 
ValueCountFrequency (%) 
60 3 0.3%
 
59 8 0.8%
 
58 7 0.7%
 
57 3 0.3%
 
56 5 0.5%
 

Attrition
Boolean

Distinct count: 2
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
0 843 84.3%
 
1 157 15.7%
 

BusinessTravel
Categorical

Distinct count: 3
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
Travel_Rarely 709 70.9%
 
Travel_Frequently 199 19.9%
 
Non-Travel 92 9.2%
 

Length

Max length: 17
Mean length: 13.52
Min length: 10
ValueCountFrequency (%) 
Lowercase_Letter 11 64.7%
 
Uppercase_Letter 4 23.5%
 
Dash_Punctuation 1 5.9%
 
Connector_Punctuation 1 5.9%
 
ValueCountFrequency (%) 
Latin 15 88.2%
 
Common 2 11.8%
 
ValueCountFrequency (%) 
ASCII 17 100.0%
 

DistanceFromHome
Real number (ℝ≥0)

Distinct count: 29
Unique (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 9.145
Minimum: 1
Maximum: 29
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 7
Q3: 13
95-th percentile: 26
Maximum: 29
Range: 28
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.120955912
Coefficient of variation (CV): 0.8880214229
Kurtosis: -0.1030362226
Mean: 9.145
Median Absolute Deviation (MAD): 5
Skewness: 1.008336875
Sum: 9145
Variance: 65.94992492
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2 143 14.3%
 
1 142 14.2%
 
9 65 6.5%
 
7 62 6.2%
 
10 58 5.8%
 
3 52 5.2%
 
8 52 5.2%
 
4 45 4.5%
 
5 44 4.4%
 
6 44 4.4%
 
Other values (19) 293 29.3%
 
ValueCountFrequency (%) 
1 142 14.2%
 
2 143 14.3%
 
3 52 5.2%
 
4 45 4.5%
 
5 44 4.4%
 
ValueCountFrequency (%) 
29 21 2.1%
 
28 17 1.7%
 
27 9 0.9%
 
26 18 1.8%
 
25 17 1.7%
 

EducationField
Categorical

Distinct count: 6
Unique (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
Life Sciences 403 40.3%
 
Medical 333 33.3%
 
Marketing 105 10.5%
 
Technical Degree 82 8.2%
 
Other 57 5.7%
 
Human Resources 20 2.0%
 

Length

Max length: 16
Mean length: 10.412
Min length: 5
ValueCountFrequency (%) 
Lowercase_Letter 17 65.4%
 
Uppercase_Letter 8 30.8%
 
Space_Separator 1 3.8%
 
ValueCountFrequency (%) 
Latin 25 96.2%
 
Common 1 3.8%
 
ValueCountFrequency (%) 
ASCII 26 100.0%
 

EnvironmentSatisfaction
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
4 308 30.8%
 
3 304 30.4%
 
2 196 19.6%
 
1 192 19.2%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

JobInvolvement
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
3 593 59.3%
 
2 259 25.9%
 
4 94 9.4%
 
1 54 5.4%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

JobRole
Categorical

Distinct count: 9
Unique (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
Sales Executive 217 21.7%
 
Research Scientist 209 20.9%
 
Laboratory Technician 166 16.6%
 
Manufacturing Director 97 9.7%
 
Healthcare Representative 90 9.0%
 
Manager 74 7.4%
 
Sales Representative 64 6.4%
 
Research Director 47 4.7%
 
Human Resources 36 3.6%
 

Length

Max length: 25
Mean length: 18.024
Min length: 7
ValueCountFrequency (%) 
Lowercase_Letter 20 69.0%
 
Uppercase_Letter 8 27.6%
 
Space_Separator 1 3.4%
 
ValueCountFrequency (%) 
Latin 28 96.6%
 
Common 1 3.4%
 
ValueCountFrequency (%) 
ASCII 29 100.0%
 

JobSatisfaction
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
3 321 32.1%
 
4 306 30.6%
 
1 188 18.8%
 
2 185 18.5%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

MaritalStatus
Categorical

Distinct count: 3
Unique (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
Married 469 46.9%
 
Single 314 31.4%
 
Divorced 217 21.7%
 

Length

Max length: 8
Mean length: 6.903
Min length: 6
ValueCountFrequency (%) 
Lowercase_Letter 11 78.6%
 
Uppercase_Letter 3 21.4%
 
ValueCountFrequency (%) 
Latin 14 100.0%
 
ValueCountFrequency (%) 
ASCII 14 100.0%
 

MonthlyIncome
Real number (ℝ≥0)

Distinct count: 941
Unique (%): 94.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6464.418
Minimum: 1009
Maximum: 19999
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1009
5-th percentile: 2120.7
Q1: 2874
Median: 4877.5
Q3: 8393
95-th percentile: 17660.3
Maximum: 19999
Range: 18990
Interquartile range (IQR): 5519

Descriptive statistics

Standard deviation: 4685.919516
Coefficient of variation (CV): 0.7248787927
Kurtosis: 1.031928028
Mean: 6464.418
Median Absolute Deviation (MAD): 2174.5
Skewness: 1.374173983
Sum: 6464418
Variance: 21957841.71
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5562 3 0.3%
 
2380 3 0.3%
 
2342 3 0.3%
 
2404 3 0.3%
 
2610 3 0.3%
 
2451 3 0.3%
 
3452 2 0.2%
 
17861 2 0.2%
 
2269 2 0.2%
 
2720 2 0.2%
 
Other values (931) 974 97.4%
 
ValueCountFrequency (%) 
1009 1 0.1%
 
1051 1 0.1%
 
1052 1 0.1%
 
1081 1 0.1%
 
1118 1 0.1%
 
ValueCountFrequency (%) 
19999 1 0.1%
 
19973 1 0.1%
 
19926 1 0.1%
 
19859 1 0.1%
 
19847 1 0.1%
 

NumCompaniesWorked
Real number (ℝ≥0)

ZEROS
Distinct count: 10
Unique (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.704
Minimum: 0
Maximum: 9
Zeros: 130
Zeros (%): 13.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
Median: 2
Q3: 4
95-th percentile: 8
Maximum: 9
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.490499265
Coefficient of variation (CV): 0.9210426274
Kurtosis: 0.02551822966
Mean: 2.704
Median Absolute Deviation (MAD): 1
Skewness: 1.030068887
Sum: 2704
Variance: 6.202586587
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
1 351 35.1%
 
0 130 13.0%
 
3 114 11.4%
 
2 104 10.4%
 
4 94 9.4%
 
7 50 5.0%
 
6 50 5.0%
 
5 38 3.8%
 
9 35 3.5%
 
8 34 3.4%
 
ValueCountFrequency (%) 
0 130 13.0%
 
1 351 35.1%
 
2 104 10.4%
 
3 114 11.4%
 
4 94 9.4%
 
ValueCountFrequency (%) 
9 35 3.5%
 
8 34 3.4%
 
7 50 5.0%
 
6 50 5.0%
 
5 38 3.8%
 

OverTime
Boolean

Distinct count: 2
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
No 716 71.6%
 
Yes 284 28.4%
 

PercentSalaryHike
Real number (ℝ≥0)

Distinct count: 15
Unique (%): 1.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.238
Minimum: 11
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 11
5-th percentile: 11
Q1: 12
Median: 14
Q3: 18
95-th percentile: 22
Maximum: 25
Range: 14
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.6431619
Coefficient of variation (CV): 0.239083994
Kurtosis: -0.2382419764
Mean: 15.238
Median Absolute Deviation (MAD): 2
Skewness: 0.8359233835
Sum: 15238
Variance: 13.27262863
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
13 143 14.3%
 
14 135 13.5%
 
11 135 13.5%
 
12 134 13.4%
 
15 76 7.6%
 
18 61 6.1%
 
16 59 5.9%
 
17 56 5.6%
 
19 48 4.8%
 
22 36 3.6%
 
Other values (5) 117 11.7%
 
ValueCountFrequency (%) 
11 135 13.5%
 
12 134 13.4%
 
13 143 14.3%
 
14 135 13.5%
 
15 76 7.6%
 
ValueCountFrequency (%) 
25 14 1.4%
 
24 13 1.3%
 
23 20 2.0%
 
22 36 3.6%
 
21 36 3.6%
 

StockOptionLevel
Categorical

Distinct count: 4
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 15.6 KiB
ValueCountFrequency (%) 
0 428 42.8%
 
1 413 41.3%
 
2 99 9.9%
 
3 60 6.0%
 

Length

Max length: 1
Mean length: 1
Min length: 1
ValueCountFrequency (%) 
Decimal_Number 4 100.0%
 
ValueCountFrequency (%) 
Common 4 100.0%
 
ValueCountFrequency (%) 
ASCII 4 100.0%
 

TotalWorkingYears
Real number (ℝ≥0)

Distinct count: 39
Unique (%): 3.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.229
Minimum: 0
Maximum: 38
Zeros: 7
Zeros (%): 0.7%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 6
Median: 10
Q3: 15
95-th percentile: 28
Maximum: 38
Range: 38
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 7.766651781
Coefficient of variation (CV): 0.6916601461
Kurtosis: 0.8238865611
Mean: 11.229
Median Absolute Deviation (MAD): 4
Skewness: 1.096541502
Sum: 11229
Variance: 60.32087988
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
10 133 13.3%
 
6 87 8.7%
 
8 73 7.3%
 
9 66 6.6%
 
5 58 5.8%
 
7 57 5.7%
 
1 56 5.6%
 
4 43 4.3%
 
12 36 3.6%
 
3 27 2.7%
 
Other values (29) 364 36.4%
 
ValueCountFrequency (%) 
0 7 0.7%
 
1 56 5.6%
 
2 25 2.5%
 
3 27 2.7%
 
4 43 4.3%
 
ValueCountFrequency (%) 
38 1 0.1%
 
37 4 0.4%
 
36 3 0.3%
 
35 2 0.2%
 
34 5 0.5%
 

TrainingTimesLastYear
Real number (ℝ≥0)

ZEROS
Distinct count: 7
Unique (%): 0.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.841
Minimum: 0
Maximum: 6
Zeros: 34
Zeros (%): 3.4%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 2
Median: 3
Q3: 3
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.300542352
Coefficient of variation (CV): 0.4577762592
Kurtosis: 0.4313947547
Mean: 2.841
Median Absolute Deviation (MAD): 1
Skewness: 0.5676822032
Sum: 2841
Variance: 1.69141041
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2 362 36.2%
 
3 346 34.6%
 
5 89 8.9%
 
4 75 7.5%
 
6 48 4.8%
 
1 46 4.6%
 
0 34 3.4%
 
ValueCountFrequency (%) 
0 34 3.4%
 
1 46 4.6%
 
2 362 36.2%
 
3 346 34.6%
 
4 75 7.5%
 
ValueCountFrequency (%) 
6 48 4.8%
 
5 89 8.9%
 
4 75 7.5%
 
3 346 34.6%
 
2 362 36.2%
 

YearsAtCompany
Real number (ℝ≥0)

ZEROS
Distinct count: 36
Unique (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.979
Minimum: 0
Maximum: 37
Zeros: 33
Zeros (%): 3.3%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
Median: 5
Q3: 9
95-th percentile: 20
Maximum: 37
Range: 37
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 6.232608154
Coefficient of variation (CV): 0.8930517487
Kurtosis: 3.872948107
Mean: 6.979
Median Absolute Deviation (MAD): 3
Skewness: 1.783427137
Sum: 6979
Variance: 38.8454044
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5 135 13.5%
 
1 119 11.9%
 
2 88 8.8%
 
3 87 8.7%
 
10 77 7.7%
 
4 75 7.5%
 
7 60 6.0%
 
9 55 5.5%
 
8 54 5.4%
 
6 51 5.1%
 
Other values (26) 199 19.9%
 
ValueCountFrequency (%) 
0 33 3.3%
 
1 119 11.9%
 
2 88 8.8%
 
3 87 8.7%
 
4 75 7.5%
 
ValueCountFrequency (%) 
37 1 0.1%
 
36 2 0.2%
 
34 1 0.1%
 
33 5 0.5%
 
32 1 0.1%
 

YearsInCurrentRole
Real number (ℝ≥0)

ZEROS
Distinct count: 19
Unique (%): 1.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.225
Minimum: 0
Maximum: 18
Zeros: 169
Zeros (%): 16.9%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 7
95-th percentile: 11
Maximum: 18
Range: 18
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.698114526
Coefficient of variation (CV): 0.8752933791
Kurtosis: 0.543232248
Mean: 4.225
Median Absolute Deviation (MAD): 3
Skewness: 0.9656815817
Sum: 4225
Variance: 13.67605105
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2 249 24.9%
 
0 169 16.9%
 
7 147 14.7%
 
3 95 9.5%
 
4 73 7.3%
 
8 56 5.6%
 
9 45 4.5%
 
1 44 4.4%
 
10 22 2.2%
 
6 21 2.1%
 
Other values (9) 79 7.9%
 
ValueCountFrequency (%) 
0 169 16.9%
 
1 44 4.4%
 
2 249 24.9%
 
3 95 9.5%
 
4 73 7.3%
 
ValueCountFrequency (%) 
18 2 0.2%
 
17 3 0.3%
 
16 5 0.5%
 
15 5 0.5%
 
14 9 0.9%
 

YearsSinceLastPromotion
Real number (ℝ≥0)

ZEROS
Distinct count: 16
Unique (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.152
Minimum: 0
Maximum: 15
Zeros: 396
Zeros (%): 39.6%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 9
Maximum: 15
Range: 15
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.199864862
Coefficient of variation (CV): 1.486926051
Kurtosis: 3.945854823
Mean: 2.152
Median Absolute Deviation (MAD): 1
Skewness: 2.048823994
Sum: 2152
Variance: 10.23913514
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
0 396 39.6%
 
1 243 24.3%
 
2 116 11.6%
 
7 50 5.0%
 
4 44 4.4%
 
3 31 3.1%
 
5 28 2.8%
 
6 23 2.3%
 
11 15 1.5%
 
8 11 1.1%
 
Other values (6) 43 4.3%
 
ValueCountFrequency (%) 
0 396 39.6%
 
1 243 24.3%
 
2 116 11.6%
 
3 31 3.1%
 
4 44 4.4%
 
ValueCountFrequency (%) 
15 9 0.9%
 
14 7 0.7%
 
13 7 0.7%
 
12 6 0.6%
 
11 15 1.5%
 

YearsWithCurrManager
Real number (ℝ≥0)

ZEROS
Distinct count: 18
Unique (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.058
Minimum: 0
Maximum: 17
Zeros: 182
Zeros (%): 18.2%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
Median: 3
Q3: 7
95-th percentile: 10.05
Maximum: 17
Range: 17
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.59883142
Coefficient of variation (CV): 0.8868485509
Kurtosis: 0.3310627141
Mean: 4.058
Median Absolute Deviation (MAD): 3
Skewness: 0.9057826493
Sum: 4058
Variance: 12.95158759
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
2 232 23.2%
 
0 182 18.2%
 
7 145 14.5%
 
3 96 9.6%
 
8 70 7.0%
 
4 67 6.7%
 
1 62 6.2%
 
9 39 3.9%
 
5 21 2.1%
 
10 19 1.9%
 
Other values (8) 67 6.7%
 
ValueCountFrequency (%) 
0 182 18.2%
 
1 62 6.2%
 
2 232 23.2%
 
3 96 9.6%
 
4 67 6.7%
 
ValueCountFrequency (%) 
17 5 0.5%
 
16 1 0.1%
 
15 5 0.5%
 
14 5 0.5%
 
13 10 1.0%
 

CommunicationSkill
Real number (ℝ≥0)

Distinct count: 5
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.041
Minimum: 1
Maximum: 5
Zeros: 0
Zeros (%): 0.0%
Memory size: 15.6 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 3
Q3: 4
95-th percentile: 5
Maximum: 5
Range: 4
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.413972531
Coefficient of variation (CV): 0.4649695926
Kurtosis: -1.296248835
Mean: 3.041
Median Absolute Deviation (MAD): 1
Skewness: -0.0428380009
Sum: 3041
Variance: 1.999318318
Histogram with fixed size bins (bins=10)
ValueCountFrequency (%) 
5 207 20.7%
 
4 206 20.6%
 
3 201 20.1%
 
2 193 19.3%
 
1 193 19.3%
 
ValueCountFrequency (%) 
1 193 19.3%
 
2 193 19.3%
 
3 201 20.1%
 
4 206 20.6%
 
5 207 20.7%
 
ValueCountFrequency (%) 
5 207 20.7%
 
4 206 20.6%
 
3 201 20.1%
 
2 193 19.3%
 
1 193 19.3%
 

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line does not affect r; only the direction of the slope sets its sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
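
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code pandas-profiling runs; the function name `pearson_r` and the toy data are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The normalising factors of the covariance and of the two standard
    # deviations cancel, so sums of centred products are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, not taken from the report.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r_up = pearson_r(x, y)                            # increasing line
r_shifted = pearson_r(x, [3 * v + 7 for v in y])  # location and scale changed
r_down = pearson_r(x, [-v for v in y])            # decreasing line
print(r_up, r_shifted, r_down)
```

Comparing `r_up` with `r_shifted` illustrates the invariance under changes of location and scale, while `r_down` shows the sign flip for a decreasing line.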

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
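
As a rough sketch of that recipe, the code below ranks both variables and then applies the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention and an assumption here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Pearson's r from sums of centred products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical toy data: monotonic but clearly non-linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

On this toy data the ranks of x and y coincide, so ρ reaches its maximum while Pearson's r stays below 1 because the relationship is not linear.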

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
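
The pair-counting definition above corresponds to the variant often called tau-a. A minimal sketch follows, with the caveat that statistical libraries commonly report a tie-adjusted variant (tau-b) instead, so results can differ when the data contain ties. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    points = list(zip(xs, ys))
    n = len(points)
    if n < 2:
        raise ValueError("need at least 2 observations")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables order the pair the same way
        elif sign < 0:
            discordant += 1   # the variables order the pair oppositely
        # sign == 0 means a tie in x or y: counted in neither group
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical toy data: one swapped pair among four observations.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant here, which gives (5 - 1) / 6.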

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
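
A sketch of the bias-corrected coefficient, as the correction is commonly stated, is given below: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at zero, and the row and column counts r and k are shrunk in the same spirit. The function name, the list-of-rows table layout and the toy tables are assumptions for illustration; this is not claimed to match the report's code line for line.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r = len(table)
    k = len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (a single row or column)
    return math.sqrt(phi2_corr / denom)


# Hypothetical 2x2 tables of counts.
v_perfect = cramers_v_corrected([[10, 0], [0, 10]])     # each row maps to one column
v_independent = cramers_v_corrected([[5, 5], [5, 5]])   # rows tell nothing about columns
print(v_perfect, v_independent)
```

The two toy tables exercise the endpoints of the scale: perfect association and exact independence.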

Missing values

Sample

First rows

Age Attrition BusinessTravel DistanceFromHome EducationField EnvironmentSatisfaction JobInvolvement JobRole JobSatisfaction MaritalStatus MonthlyIncome NumCompaniesWorked OverTime PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager CommunicationSkill
0300Non-Travel2Medical33Laboratory Technician4Single25640No140122117674
1360Travel_Rarely12Life Sciences33Manufacturing Director3Married46639Yes1227232112
2551Travel_Rarely2Medical33Sales Executive4Single51604No16012397735
3390Travel_Rarely24Life Sciences13Research Scientist4Single41087No13018277174
4370Travel_Rarely3Other33Manufacturing Director3Married94341No151102107781
5310Travel_Rarely7Life Sciences22Sales Representative3Married23293No15013277522
6321Travel_Rarely1Life Sciences42Laboratory Technician3Single37300Yes1404232121
7330Travel_Rarely4Medical12Laboratory Technician2Married38388No1108554025
8350Travel_Frequently11Marketing43Sales Executive4Divorced49681No1115352024
9211Travel_Rarely7Marketing23Sales Representative2Single26791No1301310105

Last rows

Age Attrition BusinessTravel DistanceFromHome EducationField EnvironmentSatisfaction JobInvolvement JobRole JobSatisfaction MaritalStatus MonthlyIncome NumCompaniesWorked OverTime PercentSalaryHike StockOptionLevel TotalWorkingYears TrainingTimesLastYear YearsAtCompany YearsInCurrentRole YearsSinceLastPromotion YearsWithCurrManager CommunicationSkill
990360Travel_Frequently6Life Sciences14Laboratory Technician1Married55623Yes1319332025
991340Travel_Frequently10Medical43Research Scientist3Divorced38151Yes1715453201
992390Travel_Rarely1Medical34Healthcare Representative4Divorced96130No1731951810373
993471Travel_Frequently9Life Sciences31Sales Executive3Married129367No11025323514103
994430Travel_Rarely7Life Sciences33Healthcare Representative1Married99858No16110110001
995360Non-Travel10Medical23Sales Executive4Single99801No140103103974
996400Travel_Rarely16Life Sciences33Manufacturing Director4Single79456Yes15018242332
997461Travel_Rarely9Medical32Sales Executive4Single96191No1609398474
998300Travel_Rarely2Medical32Manufacturing Director4Single68775Yes24012400005
999530Travel_Frequently2Marketing32Sales Executive2Married75252No1213021576123